Data Representation

Subject: Computer Science
Topic: 4
Cambridge Code: 0478

Number Systems

Binary (Base 2)

Binary - Base 2 numbering system (digits 0-1)

Position values: $2^7, 2^6, 2^5, 2^4, 2^3, 2^2, 2^1, 2^0$ = $128, 64, 32, 16, 8, 4, 2, 1$

Example: 10110₂ = 1(16) + 0(8) + 1(4) + 1(2) + 0(1) = 22₁₀

Converting to binary:

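Divide the decimal number by 2 repeatedly until the quotient is 0
The remainders give the binary digits
Read the remainders from bottom to top (the last remainder is the most significant bit)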
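The repeated-division steps above can be sketched in Python as follows. This is a minimal illustration rather than a prescribed method; the function name to_binary is an assumption made for this example.

```python
def to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        remainders.append(str(n % 2))  # each remainder is the next bit, least significant first
        n //= 2                        # integer-divide by 2 and repeat
    # Reading the remainders "from bottom to top" means reversing the list
    return "".join(reversed(remainders))


print(to_binary(22))  # 10110
```

Python's built-in bin(22) gives the same digits with a "0b" prefix, which is a quick way to check an answer worked out by hand.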

Hexadecimal (Base 16)

Hexadecimal - Base 16 (digits 0-9, A-F)

Digits: 0,1,2,3,4,5,6,7,8,9,A(10),B(11),C(12),D(13),E(14),F(15)

Position values: $16^3, 16^2, 16^1, 16^0$ = $4096, 256, 16, 1$

Example: 2F₁₆ = 2(16) + 15(1) = 47₁₀

Advantages:

Compact representation
Easy conversion to binary
Used for memory addresses, colors

Converting Between Systems

Binary ↔ Hexadecimal:

1 hex digit = 4 binary digits
Group the binary digits in fours, starting from the right (pad with leading zeros if needed)
Convert each group to one hex digit

Example: 10110011₂ = B3₁₆

1011₂ = B₁₆
0011₂ = 3₁₆
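The grouping method can be sketched as below. The helper names binary_to_hex and hex_to_binary are assumptions for this example, and the code only aims to mirror the hand method of converting one group of four bits at a time.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Convert a binary string to hex by splitting it into groups of four bits."""
    # Pad with leading zeros so the length is a multiple of 4
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # Each 4-bit group has a value 0-15, which indexes one hex digit
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(hex_str):
    """Convert a hex string to binary by expanding each digit to four bits."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_str)


print(binary_to_hex("10110011"))  # B3
print(hex_to_binary("2F"))        # 00101111
```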

Data Units

Bit: Single binary digit (0 or 1)

Byte: 8 bits

Size Conversions

Unit	Size
Kilobyte (KB)	1,024 bytes
Megabyte (MB)	1,024 KB
Gigabyte (GB)	1,024 MB
Terabyte (TB)	1,024 GB

Note: In casual usage, 1,024 is often rounded to 1,000 (for example, 1 KB ≈ 1,000 bytes)

Calculating Storage

Example: How many bits in 5 MB?

5 MB × 1024 KB/MB × 1024 bytes/KB × 8 bits/byte = 41,943,040 bits
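The same calculation can be written as a small helper. This is a sketch that assumes the 1,024-based units from the table above; the name to_bits and the unit labels are chosen for this example only.

```python
BITS_PER_BYTE = 8
UNIT_FACTOR = 1024  # each unit is 1,024 of the unit below it

# How many times to multiply by 1,024 to get from each unit down to bytes
POWERS = {"B": 0, "KB": 1, "MB": 2, "GB": 3, "TB": 4}


def to_bits(amount, unit):
    """Return the number of bits in `amount` of the given unit."""
    return amount * UNIT_FACTOR ** POWERS[unit] * BITS_PER_BYTE


print(to_bits(5, "MB"))  # 41943040
```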

Character Encoding

ASCII (American Standard Code for Information Interchange)

ASCII - 7-bit encoding (128 characters)

Ranges:

0-31: Control characters
32-47: Space (32) and punctuation
48-57: Digits 0-9
65-90: Uppercase A-Z
97-122: Lowercase a-z

Example:

'A' = 65 = 01000001₂
'0' = 48 = 00110000₂
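These codes can be checked with Python's built-in ord and chr functions. The helper name ascii_bits is an assumption for this sketch; it pads the code to 8 bits to match the examples above.

```python
def ascii_bits(ch):
    """Return the character code of `ch` as an 8-bit binary string."""
    return format(ord(ch), "08b")


for ch in "A0a":
    print(ch, ord(ch), ascii_bits(ch))

# chr goes the other way: from a code back to the character
print(chr(65))  # A
```

Because the digits and letters sit in consecutive ranges, ord("C") - ord("A") gives 2, which is how a letter's position in the alphabet is often calculated.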

Extended ASCII

Extended ASCII - 8-bit encoding (256 characters)

Includes accented characters
Special symbols
Scientific characters

Unicode

Unicode - Universal character set

UTF-8: Variable-length (1-4 bytes)

ASCII compatible
Most common on web

UTF-16: Variable-length (2 or 4 bytes per character)

Used in many applications

Advantages:

Supports all languages
Emojis and special characters
Global compatibility
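To see how many bytes each encoding uses per character, Python's str.encode can be applied to a few sample characters. The samples and the helper name encoded_lengths are chosen for illustration; "utf-16-le" is used so that no byte-order mark is added to the count.

```python
# Sample characters written as escapes so the source itself stays plain ASCII
SAMPLES = {
    "letter A": "A",
    "e with acute accent": "\u00e9",
    "euro sign": "\u20ac",
    "grinning face emoji": "\U0001F600",
}


def encoded_lengths(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


for name, ch in SAMPLES.items():
    utf8_len, utf16_len = encoded_lengths(ch)
    print(f"{name}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

The letter A takes a single byte in UTF-8, the same byte as in ASCII, which is what makes UTF-8 ASCII compatible.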

Image Representation

Bitmap (Raster) Images

Bitmap - Grid of colored pixels

Color representation:

RGB: Red, Green, Blue (each 0-255)
Example: 255,0,0 = Pure red

Color depth:

8-bit: 256 colors
16-bit: 65,536 colors
24-bit: 16.7 million colors

File size calculation: $\text{Size (bits)} = \text{width} \times \text{height} \times \text{color depth}$

Example: 100×100 pixels, 24-bit
Size = 100 × 100 × 24 bits = 240,000 bits = 30,000 bytes ≈ 29.3 KB
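A short sketch of the bitmap calculation, assuming an uncompressed image with no file header or metadata. The function names are assumptions for this example. With 1 KB = 1,024 bytes the result is about 29.3 KB; rounding 1 KB to 1,000 bytes gives 30 KB.

```python
def bitmap_size_bits(width, height, color_depth):
    """Uncompressed bitmap size in bits: one color value per pixel."""
    return width * height * color_depth


def bits_to_kb(bits):
    """Convert bits to kilobytes, assuming 8 bits per byte and 1,024 bytes per KB."""
    return bits / 8 / 1024


size_bits = bitmap_size_bits(100, 100, 24)
print(size_bits)                        # 240000
print(round(bits_to_kb(size_bits), 1))  # 29.3
```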

Vector Images

Vector - Mathematical descriptions of shapes

Advantages:

Scalable without quality loss
Smaller file sizes (simple shapes)
Resolution independent

Disadvantages:

Not suitable for complex images such as photographs
Less photorealistic

Image Compression

Lossy compression:

Removes data
Smaller file size
Quality degradation
Examples: JPEG (images); MP4 video uses the same idea

Lossless compression:

No data removal
Larger file size than lossy compression
Perfect restoration
PNG, GIF, ZIP

Sound Representation

Sound Digitization

Sampling - Measuring the amplitude of the sound wave at regular intervals

Sampling rate: How many samples are taken per second (measured in Hz)

CD quality: 44.1 kHz
Professional: 48 kHz
Telephony: 8 kHz
Higher rate = better quality

Sample resolution (Bit depth):

8-bit: 256 volume levels
16-bit: 65,536 volume levels
24-bit: 16.7 million levels
Higher = better quality

File size calculation: $\text{Size (bits)} = \text{Sampling rate (Hz)} \times \text{Duration (s)} \times \text{Bit depth}$ (multiply by the number of channels for stereo)

Example: 44.1 kHz, 16-bit, 3 minutes, one channel
Size = 44,100 × 180 × 16 = 127,008,000 bits = 15,876,000 bytes ≈ 15.1 MB
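The sound calculation can be sketched the same way. The channels parameter is an addition for this example (1 for mono, 2 for stereo) and is not part of the formula as stated. With 1,024-based units the result is about 15.1 MB; dividing by 1,000,000 bytes instead gives about 15.9 MB.

```python
def sound_size_bits(sample_rate_hz, duration_s, bit_depth, channels=1):
    """Uncompressed sound size in bits. `channels` is an assumed extra parameter."""
    return sample_rate_hz * duration_s * bit_depth * channels


def bits_to_mb(bits):
    """Convert bits to megabytes, assuming 1 MB = 1,024 x 1,024 bytes."""
    return bits / 8 / 1024 / 1024


size_bits = sound_size_bits(44_100, 3 * 60, 16)
print(size_bits)                        # 127008000
print(round(bits_to_mb(size_bits), 1))  # 15.1
```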

Sound Compression

Lossy (MP3, AAC):

Removes inaudible frequencies
10:1 compression ratio typical
Acceptable quality loss

Lossless (FLAC; WAV is usually stored uncompressed):

Preserves all data
Larger files
Perfect reproduction

Text Compression

Run-Length Encoding (RLE)

RLE - Replace repeated characters with count + character

Example: AAABBCDDD → 3A2B1C3D

Efficiency depends on the data: RLE is very effective for repetitive data, but it can make data with few repeats larger
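A minimal RLE sketch using the count + character format from the example above. The function names are assumptions, and the decoder assumes the original text contains no digit characters, since digits are used for the counts.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of a repeated character with its count followed by the character."""
    return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(text))


def rle_decode(encoded):
    """Reverse rle_encode. Assumes the original text contained no digits."""
    output = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch  # counts may have more than one digit
        else:
            output.append(ch * int(count))
            count = ""
    return "".join(output)


print(rle_encode("AAABBCDDD"))  # 3A2B1C3D
print(rle_encode("ABCD"))       # 1A1B1C1D
```

The second call shows the weakness: with no repeated characters, the encoded text is twice as long as the original.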

Dictionary Compression

Lempel-Ziv-Welch (LZW):

Replaces repeating sequences with codes
Adaptive dictionary
GIF images use LZW; ZIP files typically use DEFLATE, a related dictionary-based method
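The adaptive dictionary idea can be illustrated with a textbook-style LZW sketch. It assumes the input uses only 8-bit characters (codes 0-255) and returns a list of integer codes rather than packed bits, so it shows the technique and is not the exact format used by any real file type. The function names are chosen for this example.

```python
def lzw_compress(text):
    """Return a list of codes; repeated sequences are replaced by dictionary codes."""
    dictionary = {chr(i): i for i in range(256)}  # start with every single character
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                  # keep extending the known sequence
        else:
            codes.append(dictionary[current])    # output the longest known sequence
            dictionary[candidate] = next_code    # learn the new, longer sequence
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the text, recreating the same dictionary while decoding."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            entry = previous + previous[0]  # code refers to the entry being built right now
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)


print(lzw_compress("ABABABA"))  # [65, 66, 256, 258]
```

Seven characters are stored as four codes here, and the decompressor needs no copy of the dictionary because it rebuilds it from the codes themselves.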

Error Detection and Correction

Parity Bit

Parity - Extra bit for error detection

Even parity: Total 1s (including parity) = even
Odd parity: Total 1s (including parity) = odd

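Example (even): 1011010 → 10110100 (the data already has four 1s, so add 0)
Example (even): 1011000 → 10110001 (the data has three 1s, so add 1)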
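A parity bit can be computed by counting the 1s in the data. The sketch below appends the parity bit at the end of the data, as in the notes; the function names and the 7-bit sample value are assumptions for illustration.

```python
def add_parity(bits, even=True):
    """Append a parity bit so the total number of 1s is even (or odd)."""
    ones = bits.count("1")
    if even:
        parity = ones % 2        # 1 only if the count of 1s is currently odd
    else:
        parity = (ones + 1) % 2  # 1 only if the count of 1s is currently even
    return bits + str(parity)


def check_parity(word, even=True):
    """Return True if the received word (data + parity bit) has the expected parity."""
    ones = word.count("1")
    return ones % 2 == (0 if even else 1)


sent = add_parity("1011000")   # three 1s, so even parity appends a 1
print(sent)                    # 10110001
print(check_parity(sent))      # True

corrupted = "00110001"         # first bit flipped during transmission
print(check_parity(corrupted)) # False
```

A single parity bit detects any odd number of flipped bits, but if two bits flip the count of 1s stays even and the error goes unnoticed.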

Checksum

Checksum - Sum of the data values (for example, each byte) modulo some value

Added to end of data
Receiver verifies by recalculating
Detects transmission errors
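One simple way to picture this is to add up the byte values of the data and keep the remainder after dividing by 256, so the checksum fits in one byte. The modulus of 256 and the function names are assumptions for this sketch; real protocols use a variety of checksum algorithms.

```python
def checksum(data, modulus=256):
    """Sum of the byte values modulo `modulus` (256 is an assumed choice)."""
    return sum(data) % modulus


def add_checksum(data):
    """Sender: append the one-byte checksum to the end of the data."""
    return data + bytes([checksum(data)])


def verify(packet):
    """Receiver: recalculate the checksum and compare it with the one received."""
    payload, received = packet[:-1], packet[-1]
    return checksum(payload) == received


packet = add_checksum(b"HELLO")
print(packet[-1])      # 116
print(verify(packet))  # True

damaged = bytearray(packet)
damaged[0] ^= 0x01     # flip one bit in the first byte
print(verify(bytes(damaged)))  # False
```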

Error Correcting Codes

Hamming code:

Detects and corrects single-bit errors
Multiple parity bits at specific positions
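As an illustration, the Hamming(7,4) code protects 4 data bits with 3 parity bits placed at positions 1, 2 and 4 (counting from 1). The sketch below uses even parity and assumes at most one bit is flipped; the function names are chosen for this example.

```python
def hamming74_encode(data):
    """Encode a 4-bit string into a 7-bit Hamming codeword (even parity)."""
    code = [0] * 8  # index 0 is unused so list indexes match positions 1-7
    code[3], code[5], code[6], code[7] = (int(b) for b in data)
    code[1] = code[3] ^ code[5] ^ code[7]  # checks positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # checks positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # checks positions 4, 5, 6, 7
    return "".join(str(b) for b in code[1:])


def hamming74_correct(word):
    """Return (corrected codeword, error position); position 0 means no error found."""
    code = [0] + [int(b) for b in word]
    s1 = code[1] ^ code[3] ^ code[5] ^ code[7]
    s2 = code[2] ^ code[3] ^ code[6] ^ code[7]
    s4 = code[4] ^ code[5] ^ code[6] ^ code[7]
    error_position = s1 + 2 * s2 + 4 * s4  # the failed checks add up to the bad position
    if error_position:
        code[error_position] ^= 1          # flip the faulty bit back
    return "".join(str(b) for b in code[1:]), error_position


codeword = hamming74_encode("1011")
print(codeword)                      # 0110011
print(hamming74_correct("0110111"))  # ('0110011', 5)
```

In the second call the bit at position 5 was flipped; the checks for positions 1 and 4 fail, and 1 + 4 = 5 identifies the bit to correct.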

Key Points

Binary: Base 2 (0-1)
Hexadecimal: Base 16 (0-9, A-F)
Data units: Bit, byte, KB, MB, GB, TB
ASCII: 7-bit, 128 characters
Unicode: Supports all languages
Bitmap: Pixel-based, color depth matters
Vector: Math-based, scalable
Lossy compression removes data
Lossless compression preserves all data

Practice Questions

Convert binary ↔ decimal ↔ hexadecimal
Calculate file sizes
Explain character encodings
Compare bitmap vs vector
Calculate image/sound file sizes
Apply RLE compression
Detect parity errors

Revision Tips

Practice number conversions
Know data unit relationships
Understand ASCII/Unicode
Know color depth effects
Understand sampling rate importance
Compare compression types
Calculate file sizes accurately
